Automatic Speech Recognition for Uyghur, Kazakh, and Kyrgyz: An Overview
Abstract
With the emergence of deep learning, the performance of automatic speech recognition (ASR) systems has remarkably improved. Especially for resource-rich languages such as English and Chinese, commercial usage has been made feasible in a wide range of applications. However, most languages are low-resource languages, presenting three main difficulties for the development of ASR systems: (1) the scarcity of data; (2) the uncertainty in writing and pronunciation; (3) the individuality of each language. Uyghur, Kazakh, and Kyrgyz are examples of low-resource languages, all involving clear geographical variation in their pronunciation, and each language possesses its own unique acoustic properties and phonological rules. On the other hand, they belong to the same branch of the Altaic language family, so they share many commonalities. This paper presents an overview of the speech recognition techniques developed for Uyghur, Kazakh, and Kyrgyz, with the purposes of highlighting the techniques that are specifically effective for each language and generally effective for all of them, and of discovering the important factors in promoting speech recognition research for low-resource languages by a comparative study of the development path of these neighboring languages.
Similar resources
Audio-Visual Automatic Speech Recognition: An Overview
We have made significant progress in automatic speech recognition (ASR) for well-defined applications like dictation and medium vocabulary transaction processing tasks in relatively controlled environments. However, ASR performance has yet to reach the level required for speech to become a truly pervasive user interface. Indeed, even in “clean” acoustic environments, and for a variety of tasks,...
A Database for Automatic Persian Speech Emotion Recognition: Collection, Processing and Evaluation
Recent developments in robotics automation have motivated researchers to improve the efficiency of interactive systems by making a natural man-machine interaction. Since speech is the most popular method of communication, recognizing human emotions from speech signal becomes a challenging research topic known as Speech Emotion Recognition (SER). In this study, we propose a Persian em...
Lecture 4: Overview of Automatic Speech Recognition
3 Acoustic Model Training Using HTK; 3.1 Transcriptions: Master Label File; 3.2 Database File Listing: Script Files; 3.3 HMM Listing: PHF file; 3.4 Dictionaries ...
Overview of speech enhancement techniques for automatic speaker recognition
Real world conditions differ from ideal or laboratory conditions, causing mismatch between training and testing phases, and consequently, inducing performance degradation in automatic speaker recognition systems [1]. Many strategies have been adopted to cope with acoustical degradation; in some applications of speaker identification systems a clean sample of speech, prior to the recognition sta...
An Overview of Speech Recognition and Speech Synthesis Algorithms
This paper describes some speech synthesis and speech recognition algorithms and compares their performance based on accuracy and quality. In speech recognition, the DTW and HMM algorithms are compared with respect to accuracy. A comparative study of the CELP and MBROLA speech synthesis algorithms based on quality is also presented.
Journal
Journal title: Applied Sciences
Year: 2022
ISSN: 2076-3417
DOI: https://doi.org/10.3390/app13010326